pyAffy: An efficient Python/Cython

Author

  • Florian Wagner
Abstract

Robust multi-array average (RMA) is a highly successful method for processing raw data from Affymetrix expression microarrays. However, most of the work on microarray data processing predates the widespread use of Python in scientific computing. Here, I describe pyAffy, an efficient implementation of the RMA method in Python/Cython. Using data from the MAQC project, I show that this implementation produces virtually identical results compared to the RMA reference implementation in the affy R package, while running more than five times faster and consuming significantly less memory. I also show how individual steps of the RMA method affect the final expression estimates. The source code for pyAffy is available from PyPI and GitHub (https://github.com/flo-compbio/pyaffy) under an OSI-approved license. I intend to periodically revise this manuscript to ensure that it accurately reflects the functionalities available in the pyAffy Python package.
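The abstract refers to the individual steps of RMA without listing them. As the method is commonly described, it consists of background correction, quantile normalization across arrays, and summarization of each probe set by median polish on log2 intensities. The sketch below illustrates only the last two steps in pure Python. It is not pyAffy's code or API: the function names `quantile_normalize` and `median_polish` are hypothetical, background correction is omitted, and ties are broken by input order rather than handled specially.

```python
from statistics import mean, median


def quantile_normalize(matrix):
    """Give every column (array) of a probes-x-arrays matrix the same distribution.

    Illustrative helper, not part of pyAffy. Ties are broken by input order.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(sc[k] for sc in sorted_cols) for k in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Summarize one probe set (probes x arrays, log2 scale) into one value per array.

    Illustrative helper, not part of pyAffy. Alternately removes row and column
    medians and returns overall effect + column effect for each array.
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row (probe) medians.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            resid[i] = [x - row_med[i] for x in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out column (array) medians.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        # Stop once a full sweep no longer changes anything.
        if max(abs(x) for x in row_med + col_med) < tol:
            break
    return [overall + c for c in col_eff]


# Toy data: 3 probes x 2 arrays, already on the log2 scale.
normalized = quantile_normalize([[5, 4], [2, 1], [3, 6]])
estimates = median_polish([[4, 6], [5, 7], [6, 8]])
```

In a full pipeline the normalized matrix would be split by probe set and each block passed to the summarization step; the two calls above use separate toy inputs only to keep each step easy to check by hand.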

Related resources

The XL-mHG test for gene set enrichment

The nonparametric minimum hypergeometric (mHG) test is a popular alternative to Kolmogorov-Smirnov (KS)-type tests for determining gene set enrichment. However, these approaches have not been compared to each other in a quantitative manner. Here, I first perform a simulation study to show that the mHG test is significantly more powerful than the one-sided KS test for detecting gene set enrich...

PyFAI: a Python library for high performance azimuthal integration on GPU

The pyFAI package has been designed to reduce X-ray diffraction images into powder diffraction curves to be further processed by scientists. This contribution describes how to convert an image into a radial profile using the NumPy package, and how the process was accelerated using Cython. The algorithm was parallelised, needing a complete re-design to benefit from massively parallel devices like gr...

CyNEST: a maintainable Cython-based interface for the NEST simulator

NEST is a simulator for large-scale networks of spiking point neuron models (Gewaltig and Diesmann, 2007). Originally, simulations were controlled via the Simulation Language Interpreter (SLI), a built-in scripting facility implementing a language derived from PostScript (Adobe Systems, Inc., 1999). The introduction of PyNEST (Eppler et al., 2008), the Python interface for NEST, enabled users t...

A Comparison of Five Programming Languages in a Graph Clustering Scenario

The recent rise of social networks fuels the demand for efficient social web services, whose performance strongly benefits from the availability of fast graph clustering algorithms. Choosing a programming language heavily affects multiple aspects in this domain, such as runtime performance, code size, maintainability and tool support. Thus, an impartial comparison can provide valuable insights ...

Performance of Python runtimes on a non-numeric scientific code

The Python library FatGHoL [FatGHoL] used in [Murri2012] to reckon the rational homology of the moduli space of Riemann surfaces is an example of a non-numeric scientific code: most of the processing it does is generating graphs (represented by complex Python objects) and computing their isomorphisms (a triple of Python lists; again a nested data structure). These operations are repeated many t...


Publication date: 2016